This HTML document contains the output of the CATS-rf transcriptome assembly comparison tool. For details on each table and figure, refer to the tool’s documentation.

General transcriptome assembly statistics

Table 1. General transcriptome assembly statistics.

| Parameter | RSP_0.005_1_4 | RSP_0.01_1_4 | RSP_0.02_1_4 | RSP_0.005_5_10 | RSP_0.01_5_10 | RSP_0.02_5_10 | RSP_0.005_11_20 | RSP_0.01_11_20 | RSP_0.02_11_20 |
|---|---|---|---|---|---|---|---|---|---|
| N transcripts | 27748 | 30651 | 36610 | 18601 | 18652 | 19217 | 18641 | 18719 | 19432 |
| Total assembly length (bp) | 34167218 | 32433743 | 27462499 | 44994186 | 44145594 | 42094487 | 45586004 | 45065633 | 43400943 |
| N, % transcripts longer than 200 bp | 23903, 86.14% | 25617, 83.58% | 28233, 77.12% | 18277, 98.26% | 18225, 97.71% | 18132, 94.35% | 18371, 98.55% | 18312, 97.83% | 18210, 93.71% |
| N, % transcripts longer than 500 bp | 14553, 52.45% | 13923, 45.42% | 11423, 31.2% | 16878, 90.74% | 16738, 89.74% | 16335, 85% | 17019, 91.3% | 16886, 90.21% | 16573, 85.29% |
| N, % transcripts longer than 1000 bp | 9887, 35.63% | 8846, 28.86% | 6486, 17.72% | 13488, 72.51% | 13356, 71.61% | 12875, 67% | 13620, 73.06% | 13517, 72.21% | 13131, 67.57% |
| N, % transcripts longer than 5000 bp | 1036, 3.73% | 914, 2.98% | 608, 1.66% | 2003, 10.77% | 1931, 10.35% | 1766, 9.19% | 2024, 10.86% | 1987, 10.61% | 1872, 9.63% |
| N, % transcripts longer than 10000 bp | 93, 0.34% | 90, 0.29% | 59, 0.16% | 228, 1.23% | 221, 1.18% | 202, 1.05% | 234, 1.26% | 230, 1.23% | 225, 1.16% |
| N, % transcripts longer than 20000 bp | 4, 0.01% | 6, 0.02% | 6, 0.02% | 10, 0.05% | 10, 0.05% | 6, 0.03% | 10, 0.05% | 12, 0.06% | 11, 0.06% |
| Mean transcript length (bp) | 1231.34 | 1058.16 | 750.14 | 2418.91 | 2366.8 | 2190.48 | 2445.47 | 2407.48 | 2233.48 |
| Median transcript length (bp) | 547.5 | 434 | 310 | 1756 | 1714 | 1575 | 1778 | 1749 | 1605 |
| Transcript length IQR (bp) | 258-1570 | 234-1238 | 206-641 | 926-3168 | 899-3096 | 762-2878 | 944-3211 | 914-3153.5 | 776.75-2950 |
| Transcript length range (bp) | 131-60159 | 131-58349 | 131-45657 | 131-50033 | 131-50033 | 131-49970 | 131-50248 | 132-50317 | 132-50173 |
| N50 (bp) | 2597 | 2388 | 1831 | 3590 | 3527 | 3401 | 3602 | 3576 | 3485 |
| L50 | 3745 | 3737 | 3789 | 3836 | 3816 | 3748 | 3876 | 3847 | 3794 |
| N90 (bp) | 477 | 372 | 249 | 1215 | 1194 | 1135 | 1224 | 1214 | 1160 |
| L90 | 14985 | 16939 | 22581 | 12212 | 12182 | 12096 | 12283 | 12234 | 12208 |
| GC content (%) | 49.17% | 49.34% | 49.68% | 48.43% | 48.45% | 48.60% | 48.37% | 48.39% | 48.52% |
| % of reads mapping to the assembly | 97.71% | 96.08% | 89.99% | 99.65% | 99.53% | 98.51% | 99.67% | 99.56% | 98.64% |

IQR = interquartile range
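
The contiguity metrics in Table 1 (N50, L50, N90, L90) can be illustrated with a short sketch. It uses the conventional definitions: Nx is the length of the shortest transcript in the smallest set of longest transcripts that together hold at least x% of the total assembly length, and Lx is the number of transcripts in that set. The function name `nx_lx` and the toy lengths are hypothetical, and the tool itself may compute these values differently.

```python
def nx_lx(lengths, percent=50):
    """Return (Nx, Lx) for a list of transcript lengths (conventional definition)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length, count


toy_lengths = [100, 200, 300, 400, 500]  # hypothetical transcript lengths
print(nx_lx(toy_lengths, 50))  # (400, 2)
print(nx_lx(toy_lengths, 90))  # (200, 4)

# Cross-check using Table 1 values for RSP_0.005_1_4:
# mean transcript length = total assembly length / N transcripts.
print(round(34167218 / 27748, 2))  # 1231.34
```

The last line reproduces the mean transcript length reported for RSP_0.005_1_4 from the total assembly length and transcript count in the same column.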

Assembly scores

Table 2. Transcript score component statistics and assembly scores.

| Parameter | RSP_0.005_1_4 | RSP_0.01_1_4 | RSP_0.02_1_4 | RSP_0.005_5_10 | RSP_0.01_5_10 | RSP_0.02_5_10 | RSP_0.005_11_20 | RSP_0.01_11_20 | RSP_0.02_11_20 |
|---|---|---|---|---|---|---|---|---|---|
| Coverage score component (mean, IQR) | 0.571, 0.413-0.747 | 0.577, 0.425-0.75 | 0.586, 0.442-0.748 | 0.88, 0.839-1 | 0.883, 0.846-1 | 0.866, 0.851-1 | 0.895, 0.855-1 | 0.895, 0.862-1 | 0.871, 0.864-1 |
| Accuracy score component (mean, IQR) | 0.923, 0.906-0.944 | 0.882, 0.862-0.903 | 0.814, 0.785-0.837 | 0.941, 0.924-0.967 | 0.902, 0.889-0.923 | 0.781, 0.771-0.793 | 0.944, 0.924-0.972 | 0.908, 0.893-0.93 | 0.778, 0.768-0.79 |
| Local fidelity score component (mean, IQR) | 0.82, 0.698-1 | 0.758, 0.622-0.948 | 0.647, 0.5-0.817 | 0.969, 0.952-1 | 0.958, 0.938-1 | 0.923, 0.901-1 | 0.97, 0.956-1 | 0.962, 0.944-1 | 0.934, 0.914-1 |
| Integrity score component (mean, IQR) | 0.714, 0.493-1 | 0.685, 0.45-1 | 0.646, 0.408-1 | 0.92, 0.891-1 | 0.917, 0.89-1 | 0.898, 0.885-1 | 0.919, 0.888-1 | 0.916, 0.889-1 | 0.902, 0.883-1 |
| Assembly score | 0.349 | 0.3 | 0.219 | 0.756 | 0.719 | 0.582 | 0.771 | 0.734 | 0.588 |

IQR = interquartile range
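
Many rows in this report summarize a per-transcript quantity as "mean, IQR". A minimal sketch of that kind of summary follows. It assumes quartiles are obtained by linear interpolation between order statistics (the "inclusive" method of Python's `statistics.quantiles`), an assumption suggested by fractional bounds such as 914-3153.5 in Table 1 but not confirmed by the report. The function names and the sample values are hypothetical.

```python
from statistics import mean, quantiles


def mean_iqr(values):
    """Return (mean, Q1, Q3) using linearly interpolated quartiles."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return mean(values), q1, q3


def format_mean_iqr(values, digits=3):
    """Format a summary in the report's 'mean, Q1-Q3' style without trailing zeros."""
    m, q1, q3 = mean_iqr(values)
    return f"{round(m, digits):g}, {round(q1, digits):g}-{round(q3, digits):g}"


print(format_mean_iqr([1, 2, 3, 4]))  # 2.5, 1.75-3.25
```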

Figure 1. Transcript score distribution.

Coverage and accuracy statistics

Table 3. Coverage and accuracy statistics.

| Parameter | RSP_0.005_1_4 | RSP_0.01_1_4 | RSP_0.02_1_4 | RSP_0.005_5_10 | RSP_0.01_5_10 | RSP_0.02_5_10 | RSP_0.005_11_20 | RSP_0.01_11_20 | RSP_0.02_11_20 |
|---|---|---|---|---|---|---|---|---|---|
| % of covered bases per transcript (mean, IQR) | 98.63%, 100%-100% | 98.67%, 100%-100% | 98.76%, 100%-100% | 98.76%, 100%-100% | 98.79%, 100%-100% | 97.06%, 100%-100% | 98.79%, 100%-100% | 98.52%, 100%-100% | 95.9%, 100%-100% |
| N, % fully covered transcripts | 23038, 83.03% | 25430, 82.97% | 30104, 82.23% | 15024, 80.77% | 14980, 80.31% | 15284, 79.53% | 15099, 81% | 15188, 81.14% | 15317, 78.82% |
| N, % fully uncovered transcripts | 36, 0.13% | 42, 0.14% | 41, 0.11% | 63, 0.34% | 74, 0.4% | 409, 2.13% | 59, 0.32% | 124, 0.66% | 634, 3.26% |
| Mean coverage per transcript (mean, IQR) | 6.34, 3.09-7.45 | 6.32, 3.19-7.19 | 6.42, 3.32-7.37 | 33.2, 17.9-40.3 | 33.47, 17.91-40.55 | 32.59, 17.32-40.23 | 42.76, 23.51-51.87 | 42.79, 23.48-52.04 | 41.54, 22.5-51.51 |
| N, % bases with coverage >= 5 | 23845886, 69.79% | 23141946, 71.35% | 20451822, 74.47% | 41425701, 92.07% | 40935879, 92.73% | 39431322, 93.67% | 42562777, 93.37% | 42223041, 93.69% | 40990764, 94.45% |
| N, % bases with coverage >= 10 | 13358614, 39.1% | 13276215, 40.93% | 12654100, 46.08% | 37717449, 83.83% | 37394816, 84.71% | 36423117, 86.53% | 39764717, 87.23% | 39486167, 87.62% | 38595620, 88.93% |
| N, % bases with coverage >= 20 | 4725313, 13.83% | 4796435, 14.79% | 4864020, 17.71% | 29063963, 64.59% | 28926419, 65.53% | 28482078, 67.66% | 33564487, 73.63% | 33438325, 74.2% | 32923129, 75.86% |
| N, % bases with coverage >= 40 | 839997, 2.46% | 877186, 2.7% | 971977, 3.54% | 15798960, 35.11% | 15784592, 35.76% | 15709617, 37.32% | 19938620, 43.74% | 19933054, 44.23% | 19625900, 45.22% |
| N, % bases with coverage >= 60 | 209970, 0.61% | 208588, 0.64% | 230375, 0.84% | 9220926, 20.49% | 9260066, 20.98% | 9253720, 21.98% | 13582991, 29.8% | 13545362, 30.06% | 13397911, 30.87% |
| N, % bases with coverage >= 80 | 80513, 0.24% | 79379, 0.24% | 77643, 0.28% | 5744046, 12.77% | 5800514, 13.14% | 5827253, 13.84% | 8780565, 19.26% | 8752561, 19.42% | 8658865, 19.95% |
| N, % bases with coverage >= 100 | 35769, 0.1% | 35408, 0.11% | 33432, 0.12% | 3689834, 8.2% | 3737815, 8.47% | 3788058, 9% | 6148524, 13.49% | 6137951, 13.62% | 6111579, 14.08% |
| Maximum uncovered region length per transcript (bp) (mean, IQR) | 16.77, 0-0 | 14.43, 0-0 | 7.59, 0-0 | 25.71, 0-0 | 21.32, 0-0 | 19.76, 0-0 | 25.46, 0-0 | 23.38, 0-0 | 21.67, 0-0 |
| Mean end coverage per transcript (mean, IQR) | 1.88, 1-2.14 | 1.89, 1-2.09 | 1.86, 1-2.09 | 7.46, 2.6-10.39 | 7.54, 2.81-10.3 | 7.64, 3.03-10.37 | 9.51, 3.03-13.57 | 9.65, 3.26-13.46 | 9.39, 3.43-13.1 |
| N, % assembly bases in LCR | 6943649, 20.32% | 6303356, 19.43% | 4826344, 17.57% | 2680314, 5.96% | 2377600, 5.39% | 1960987, 4.66% | 2301339, 5.05% | 2152479, 4.78% | 1816114, 4.18% |
| % of bases in LCR per transcript (mean, IQR) | 37.1%, 10.96%-55.92% | 36.07%, 10.69%-53.57% | 34.48%, 10.81%-50.04% | 5.4%, 0%-4.17% | 5.37%, 0%-3.83% | 7.54%, 0%-3.54% | 4.43%, 0%-3.32% | 4.74%, 0%-3.05% | 7.76%, 0%-2.94% |
| LCR length (bp) (mean, IQR) | 68.79, 17-75 | 65.75, 17-70 | 55.08, 18-63 | 80.18, 15-72 | 73.61, 14-66 | 63.81, 11-57 | 77.55, 15-67 | 74.09, 13-64 | 65.02, 11-58 |
| Coverage score component (mean, IQR) | 0.571, 0.413-0.747 | 0.577, 0.425-0.75 | 0.586, 0.442-0.748 | 0.88, 0.839-1 | 0.883, 0.846-1 | 0.866, 0.851-1 | 0.895, 0.855-1 | 0.895, 0.862-1 | 0.871, 0.864-1 |
| % of accurate bases (bases with accuracy >= 0.95) per transcript (mean, IQR) | 97%, 95.85%-98.62% | 94.57%, 92.44%-97.1% | 89.73%, 85.78%-94.1% | 96.61%, 95.63%-98.09% | 93.64%, 91.84%-95.91% | 86.5%, 83.65%-89.57% | 97.23%, 96.48%-98.55% | 94.7%, 93.38%-96.72% | 87.94%, 85.47%-91.02% |
| N, % bases with accuracy >= 0.2 | 33313030, 99.92% | 31601721, 99.86% | 26925680, 99.71% | 44260356, 99.97% | 43518057, 99.95% | 41528758, 99.91% | 44887538, 99.97% | 44403219, 99.95% | 42801759, 99.92% |
| N, % bases with accuracy >= 0.4 | 33301040, 99.89% | 31577602, 99.78% | 26878119, 99.54% | 44254180, 99.95% | 43509893, 99.93% | 41514245, 99.87% | 44881465, 99.95% | 44394952, 99.93% | 42788744, 99.89% |
| N, % bases with accuracy >= 0.6 | 33253513, 99.74% | 31503527, 99.55% | 26768557, 99.13% | 44225909, 99.89% | 43473900, 99.85% | 41464738, 99.75% | 44854147, 99.89% | 44359662, 99.85% | 42742607, 99.78% |
| N, % bases with accuracy >= 0.8 | 33103327, 99.29% | 31281236, 98.84% | 26457510, 97.98% | 44120949, 99.65% | 43341481, 99.54% | 41279991, 99.31% | 44754124, 99.67% | 44234061, 99.57% | 42573551, 99.39% |
| N, % bases with accuracy >= 0.85 | 32928658, 98.77% | 30989264, 97.92% | 26025324, 96.38% | 44025332, 99.44% | 43196810, 99.21% | 41034421, 98.72% | 44668563, 99.48% | 44108666, 99.29% | 42366969, 98.9% |
| N, % bases with accuracy >= 0.9 | 32660971, 97.97% | 30512541, 96.41% | 25246304, 93.5% | 43846171, 99.03% | 42885186, 98.5% | 40389557, 97.17% | 44515186, 99.14% | 43851329, 98.71% | 41830500, 97.65% |
| N, % bases with accuracy >= 0.95 | 32023143, 96.05% | 29335708, 92.7% | 23195638, 85.9% | 42993455, 97.11% | 41164374, 94.54% | 36555969, 87.95% | 43810794, 97.57% | 42372204, 95.38% | 38227764, 89.24% |
| N, % bases with accuracy >= 0.99 | 31361027, 94.07% | 28186993, 89.07% | 21426951, 79.35% | 37595926, 84.91% | 31577993, 72.53% | 22265764, 53.57% | 37392839, 83.27% | 30710976, 69.13% | 20731797, 48.4% |
| N, % bases with accuracy >= 1 | 31347635, 94.03% | 28174215, 89.03% | 21420763, 79.33% | 36213332, 81.79% | 30224390, 69.42% | 21585912, 51.93% | 34989397, 77.92% | 28447188, 64.03% | 19674527, 45.93% |
| N, % assembly bases in LAR | 1002861, 3.01% | 2441506, 7.71% | 7172715, 26.56% | 988578, 2.23% | 2487513, 5.71% | 15740460, 37.87% | 961665, 2.14% | 2304263, 5.19% | 16911438, 39.48% |
| % of bases in LAR per transcript (mean, IQR) | 2.88%, 1.06%-3.55% | 6.9%, 3.88%-8.92% | 19.47%, 11.92%-26.44% | 2.54%, 0.59%-2.91% | 6.53%, 3.96%-7.59% | 36.09%, 31.7%-41.24% | 2.43%, 0.42%-2.79% | 5.91%, 3.27%-6.92% | 38.22%, 34.13%-43.07% |
| LAR length (bp) (mean, IQR) | 4, 1-4 | 4.49, 1-5 | 7.42, 1-10 | 6.89, 1-6 | 4.93, 1-6 | 7.63, 1-10 | 7.7, 1-7 | 5.06, 1-6 | 7.56, 1-10 |
| Accuracy score component (mean, IQR) | 0.923, 0.906-0.944 | 0.882, 0.862-0.903 | 0.814, 0.785-0.837 | 0.941, 0.924-0.967 | 0.902, 0.889-0.923 | 0.781, 0.771-0.793 | 0.944, 0.924-0.972 | 0.908, 0.893-0.93 | 0.778, 0.768-0.79 |

IQR = interquartile range, LCR = low-coverage region, LAR = low-accuracy region
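
The "N, % bases with coverage >= k" rows and the "maximum uncovered region length" row in Table 3 can be illustrated on a per-base coverage vector. The sketch below is only an illustration: the function names and the toy coverage values are hypothetical, "uncovered" is assumed to mean a depth of 0, and the percentage denominator is assumed to be the number of bases supplied. The tool's own definitions may differ.

```python
def coverage_threshold_counts(per_base_coverage, thresholds=(5, 10, 20, 40, 60, 80, 100)):
    """Map each threshold to (N bases at or above it, percentage of all bases)."""
    total = len(per_base_coverage)
    if total == 0:
        raise ValueError("per_base_coverage must not be empty")
    result = {}
    for threshold in thresholds:
        n_bases = sum(1 for depth in per_base_coverage if depth >= threshold)
        result[threshold] = (n_bases, 100.0 * n_bases / total)
    return result


def max_uncovered_run(per_base_coverage):
    """Length of the longest run of consecutive bases with depth 0 (assumed definition)."""
    longest = current = 0
    for depth in per_base_coverage:
        current = current + 1 if depth == 0 else 0
        longest = max(longest, current)
    return longest


toy_depths = [0, 3, 5, 12, 25, 40, 7, 100]  # hypothetical per-base depths
counts = coverage_threshold_counts(toy_depths)
print(counts[5])    # (6, 75.0)
print(counts[100])  # (1, 12.5)
print(max_uncovered_run([0, 0, 3, 0, 0, 0, 5]))  # 3
```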

Figure 2. Per-base coverage category distribution.

Figure 3. Proportion of covered bases per transcript category distribution.

Figure 4. Mean transcript coverage category distribution.

Figure 5. Positional relative coverage distribution.

Figure 6. Maximum uncovered region length per transcript distribution.

Figure 7. Mean transcript end coverage per transcript distribution.

Figure 8. Proportion of bases in low-coverage regions per transcript category distribution.

Figure 9. Low-coverage region length distribution.

Figure 10. Coverage score component distribution.

Figure 11. Per-base accuracy category distribution.

Figure 12. Proportion of accurate bases per transcript category distribution.

Figure 13. Positional accuracy distribution.

Figure 14. Proportion of bases in low-accuracy regions per transcript category distribution.

Figure 15. Low-accuracy region length distribution.

Figure 16. Accuracy score component distribution.

Paired-end read analysis

Table 4. Local fidelity and integrity statistics.

| Parameter | RSP_0.005_1_4 | RSP_0.01_1_4 | RSP_0.02_1_4 | RSP_0.005_5_10 | RSP_0.01_5_10 | RSP_0.02_5_10 | RSP_0.005_11_20 | RSP_0.01_11_20 | RSP_0.02_11_20 |
|---|---|---|---|---|---|---|---|---|---|
| N, % reads with pair not mapped to the assembly | 44671, 1.23% | 66086, 1.85% | 117364, 3.5% | 41295, 0.22% | 51628, 0.28% | 96421, 0.53% | 51420, 0.21% | 60936, 0.25% | 110944, 0.47% |
| N, % reads with pair not mapped to the assembly on transcript ends | 16728, 10.02% | 26399, 13.32% | 52156, 19.52% | 8223, 2.28% | 9713, 2.63% | 16557, 4.28% | 11085, 2.43% | 10830, 2.33% | 17004, 3.54% |
| % of reads with pair not mapped to the assembly on transcript ends per transcript (mean, IQR) | 11.56%, 0%-20% | 15.34%, 0%-25% | 22.31%, 0%-33.33% | 2.5%, 0%-0% | 3.51%, 0%-0% | 6.09%, 0%-7.69% | 2.31%, 0%-0% | 2.97%, 0%-0% | 4.97%, 0%-5.26% |
| N, % reads with pair mapped in an unexpected orientation | 4, 0% | 14, 0% | 0, 0% | 642, 0% | 2, 0% | 542, 0% | 80, 0% | 48, 0% | 76, 0% |
| N, % reads with pair mapped too far apart | 6080, 0.17% | 8986, 0.25% | 29414, 0.88% | 21110, 0.11% | 21000, 0.11% | 22020, 0.12% | 27712, 0.12% | 27256, 0.11% | 27666, 0.12% |
| N, % improperly paired reads within a transcript | 50755, 1.39% | 75086, 2.1% | 146778, 4.38% | 63047, 0.34% | 72630, 0.39% | 118983, 0.65% | 79212, 0.33% | 88240, 0.37% | 138686, 0.58% |
| % of improperly paired reads within a transcript per transcript (mean, IQR) | 7.22%, 0%-9.09% | 10.55%, 0.27%-14.29% | 17.62%, 3.43%-25% | 0.44%, 0%-0.23% | 0.63%, 0%-0.38% | 1.46%, 0%-0.98% | 0.41%, 0%-0.19% | 0.51%, 0%-0.32% | 1.08%, 0%-0.75% |
| Local fidelity score component (mean, IQR) | 0.82, 0.698-1 | 0.758, 0.622-0.948 | 0.647, 0.5-0.817 | 0.969, 0.952-1 | 0.958, 0.938-1 | 0.923, 0.901-1 | 0.97, 0.956-1 | 0.962, 0.944-1 | 0.934, 0.914-1 |
| N, % reads with pair mapped to another transcript | 92682, 2.54% | 105812, 2.95% | 137456, 4.1% | 288946, 1.56% | 287574, 1.56% | 275434, 1.51% | 379216, 1.58% | 378108, 1.57% | 367392, 1.54% |
| % of reads with pair mapped to another transcript per transcript (mean, IQR) | 14.4%, 0%-22.22% | 15.73%, 0%-25% | 17.42%, 0%-26.67% | 3.26%, 0%-2.69% | 3.44%, 0%-2.74% | 4.9%, 0%-2.92% | 3.28%, 0%-2.82% | 3.51%, 0%-2.78% | 4.66%, 0%-3.01% |
| N, % fragmented transcripts | 571, 2.06% | 818, 2.67% | 1499, 4.09% | 727, 3.91% | 704, 3.77% | 707, 3.68% | 808, 4.33% | 802, 4.28% | 806, 4.15% |
| N, % reads representing bridging events on transcript ends | 9074, 5.44% | 12520, 6.32% | 20270, 7.58% | 6614, 1.83% | 6212, 1.68% | 6478, 1.68% | 8486, 1.86% | 8146, 1.75% | 8818, 1.83% |
| % of reads representing bridging events on transcript ends per transcript (mean, IQR) | 6.36%, 0%-0% | 7.19%, 0%-12.5% | 8.24%, 0%-14.29% | 1.88%, 0%-0% | 1.76%, 0%-0% | 1.83%, 0%-0% | 1.81%, 0%-0% | 1.79%, 0%-0% | 1.84%, 0%-0% |
| Integrity score component (mean, IQR) | 0.714, 0.493-1 | 0.685, 0.45-1 | 0.646, 0.408-1 | 0.92, 0.891-1 | 0.917, 0.89-1 | 0.898, 0.885-1 | 0.919, 0.888-1 | 0.916, 0.889-1 | 0.902, 0.883-1 |

IQR = interquartile range
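
In Table 4, the count of improperly paired reads within a transcript equals the sum of three categories: pair not mapped, pair in an unexpected orientation, and pair mapped too far apart (for RSP_0.005_1_4: 44671 + 4 + 6080 = 50755). The sketch below shows one way such a classification could be tallied. Everything in it is an assumption made for illustration: the `ReadMapping` record, the `classify_pair` function, the rule that same-strand mates count as an unexpected orientation, and the `max_distance` cutoff are hypothetical and are not taken from the tool.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadMapping:
    transcript: str
    position: int
    reverse: bool
    mate_transcript: Optional[str] = None  # None means the mate is unmapped
    mate_position: Optional[int] = None
    mate_reverse: Optional[bool] = None


def classify_pair(read, max_distance=1000):
    """Assign a read to one pairing category (hypothetical rules and cutoff)."""
    if read.mate_transcript is None:
        return "pair_not_mapped"
    if read.mate_transcript != read.transcript:
        return "pair_on_another_transcript"
    if read.reverse == read.mate_reverse:
        return "unexpected_orientation"
    if abs(read.position - read.mate_position) > max_distance:
        return "too_far_apart"
    return "properly_paired"


IMPROPER_WITHIN_TRANSCRIPT = {"pair_not_mapped", "unexpected_orientation", "too_far_apart"}

reads = [  # hypothetical mappings
    ReadMapping("t1", 100, False),
    ReadMapping("t1", 100, False, "t2", 50, True),
    ReadMapping("t1", 100, False, "t1", 300, False),
    ReadMapping("t1", 100, False, "t1", 5000, True),
    ReadMapping("t1", 100, False, "t1", 300, True),
]
tally = Counter(classify_pair(r) for r in reads)
improper = sum(tally[c] for c in IMPROPER_WITHIN_TRANSCRIPT)
print(improper)  # 3
```

Reads whose mate maps to another transcript are kept in a separate category here because Table 4 reports them separately from the improperly paired reads.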

Figure 17. Per-transcript proportion of improperly paired reads within a transcript category distribution.

Figure 18. Local fidelity score component distribution.

Figure 19. Per-transcript proportion of reads with pair mapped to another transcript category distribution.

Figure 20. Integrity score component distribution.